hysop.backend.device.opencl.autotunable_kernels.transpose module
- class hysop.backend.device.opencl.autotunable_kernels.transpose.OpenClAutotunableTransposeKernel(cl_env, typegen, build_opts, autotuner_config, **kwds)
Bases: OpenClAutotunableKernel
Autotunable interface for transpose kernel code generators.
- autotune(is_inplace, input_buffer, output_buffer, axes, hardcode_arrays, name=None, **kwds)
Autotune this kernel with specified axes, inputs and outputs.
- compute_args_mapping(extra_kwds, extra_parameters)
Return the arguments mapping, a dictionary with argument names as keys and tuples as values.
Each tuple should contain (arg_position, arg_type(s)), where arg_position is an int and arg_type(s) is a type or tuple of types against which the argument is checked.
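The shape of such a mapping can be illustrated with a minimal sketch. The argument names (in_buffer, out_buffer, offset) and the helper order_and_check_args are hypothetical and chosen only for illustration; they are not taken from hysop, and the real kernel arguments will differ.

```python
# Hypothetical arguments mapping: name -> (arg_position, arg_type(s)).
# These names and types are illustrative assumptions, not hysop's actual ones.
args_mapping = {
    "in_buffer": (0, bytes),                # a single accepted type
    "out_buffer": (1, (bytes, bytearray)),  # a tuple of accepted types
    "offset": (2, int),
}


def order_and_check_args(mapping, **kwargs):
    """Type-check keyword arguments against the mapping and
    return them as a list ordered by arg_position."""
    ordered = [None] * len(mapping)
    for name, (position, types) in mapping.items():
        if name not in kwargs:
            raise KeyError(f"missing kernel argument: {name}")
        value = kwargs[name]
        # isinstance accepts either a type or a tuple of types
        if not isinstance(value, types):
            raise TypeError(f"argument {name} has type {type(value).__name__}")
        ordered[position] = value
    return ordered


kernel_args = order_and_check_args(
    args_mapping, in_buffer=b"abc", out_buffer=bytearray(3), offset=0
)
```

Keeping the position and the accepted types together in one tuple lets a caller pass arguments by name while the check and the positional ordering happen in a single pass.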
- compute_global_work_size(local_work_size, work, extra_parameters, extra_kwds)
Compute aligned global_work_size from unaligned global_work_size and local_work_size. Input global_work_size may be None.
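A common way to align a global work size is to round each dimension up to the next multiple of the corresponding local work size. The sketch below shows that rounding only; the function name align_global_work_size is hypothetical, and whether this method uses exactly this rule (or how it handles a None input) is an assumption not confirmed by this page.

```python
def align_global_work_size(global_work_size, local_work_size):
    """Round each global dimension up to a multiple of the local dimension.

    Illustrative helper (hypothetical name); assumes both arguments are
    sequences of positive ints with the same length.
    """
    if len(global_work_size) != len(local_work_size):
        raise ValueError("work sizes must have the same number of dimensions")
    return tuple(
        ((g + l - 1) // l) * l  # integer ceiling of g / l, times l
        for g, l in zip(global_work_size, local_work_size)
    )


aligned = align_global_work_size((100, 7), (32, 4))
```

With this rounding the aligned grid may contain more work-items than the unaligned one, so a kernel launched on it typically needs a bounds check on the padded items.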
- compute_work_candidates(work_bounds, work_load, extra_parameters, extra_kwds)
Configure work (global_size and local_size candidates) given an OpenClWorkBoundsConfiguration object and a work_load.
Return a WorkConfiguration object.
Notes
global_work_size can be ignored if it depends on local_work_size; in that case it will be set in self.compute_global_work_size().